Towards a Multilingual Financial Narrative Processing System

Authors

  • Mahmoud El-Haj
  • Paul Rayson
  • Paulo Alves
  • Steven Young
Abstract

Large-scale financial narrative processing for UK annual reports has only become possible in the last few years, with our prior work on automatically understanding and extracting the structure of unstructured glossy PDF reports. This has levelled the playing field somewhat relative to US research, where annual reports (10-K Forms) have a rigid structure imposed on them by legislation and are submitted in plain text format. Structure extraction is just the first step in a pipeline of analyses to examine disclosure quality and change over time relative to financial results. In this paper, we describe and evaluate the use of similar Information Extraction and Natural Language Processing methods for the extraction and analysis of annual financial reports in a second language (Portuguese), in order to evaluate the applicability of our techniques in another national context (Portugal). Extraction accuracy varies between languages, with English exceeding 95%. To further examine the robustness of our techniques, we apply the extraction methods to a comprehensive sample of annual reports published by UK and Portuguese non-financial firms between 2003 and 2014.

Similar resources

Towards Development of Multilingual Spoken Dialogue Systems

Developing multilingual dialogue systems brings up various challenges. Among them is the development of natural language understanding and generation components, with a focus on creating new language parts as rapidly as possible. Another challenge is to ensure compatibility between the different language-specific components during maintenance and ongoing development of the system. We describe our expe...

Event Extraction and Temporal Ordering towards Narrative Model Generation

Narrative generation is the process of automatically generating narrative stories (fictional or not) through a narrative model. With the advances in machine learning and natural language processing techniques, narrative generators have been able to create more creative and interesting text in terms of narrative contents and story telling techniques. These systems have applications in a range of...

Grammar Sharing Techniques for Rule-based Multilingual NLP Systems

Rule-based multilingual natural language processing (NLP) applications such as machine translation systems require the development of grammars for multiple languages. Grammar writing, however, is often a slow and laborious process. In this paper we describe a methodology for multilingual and multipurpose grammar development based on grammar sharing. This paper presents the first step towards a ...

OMWEdit - The Integrated Open Multilingual Wordnet Editing System

Wordnets play a central role in many natural language processing tasks. This paper introduces a multilingual editing system for the Open Multilingual Wordnet (OMW: Bond and Foster, 2013). Wordnet development, like most lexicographic tasks, is slow and expensive. Moving away from the original Princeton Wordnet (Fellbaum, 1998) development workflow, wordnet creation and expansion has increasingly...

Towards a new level of annotation detail of multilingual speech corpora

The aim of this paper is to highlight the actual need for corpora that have been annotated based on acoustic information. The acoustic information should be coded in features or properties and is needed to inform further processing systems, i.e. to present a basis for a speech recognition system using linguistic information. Feature annotation of existing corpora in combination with segmental a...

Publication date: 2018